Overview

Dataset statistics

Number of variables: 13
Number of observations: 2969
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 301.7 KiB
Average record size in memory: 104.0 B

Variable types

NUM: 13

Warnings

qtde_items is highly correlated with gross_revenue [High correlation]
gross_revenue is highly correlated with qtde_items [High correlation]
qtde_returns is highly correlated with avg_ticket and 1 other field [High correlation]
avg_ticket is highly correlated with qtde_returns and 1 other field [High correlation]
avg_basket_size is highly correlated with avg_ticket and 1 other field [High correlation]
avg_ticket is highly skewed (γ1 = 53.4442279) [Skewed]
frequency is highly skewed (γ1 = 24.88037069) [Skewed]
qtde_returns is highly skewed (γ1 = 51.79774426) [Skewed]
avg_basket_size is highly skewed (γ1 = 44.68328098) [Skewed]
df_index has unique values [Unique]
customer_id has unique values [Unique]
avg_ticket has unique values [Unique]
recency_days has 34 (1.1%) zeros [Zeros]
qtde_returns has 1481 (49.9%) zeros [Zeros]
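
The Skewed warnings quote a skewness coefficient γ1 for each flagged variable. The following is a minimal sketch of how such a flag could be computed, assuming the bias-corrected (adjusted Fisher-Pearson) sample skewness and a cut-off of 20. Both the cut-off and the function names `sample_skewness` and `is_highly_skewed` are assumptions made for illustration; the cut-off is merely consistent with which variables are flagged in this report and is not stated in it.

```python
import math


def sample_skewness(values):
    """Adjusted Fisher-Pearson sample skewness (bias-corrected g1)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(values) / n
    # Second and third central moments (population form).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        return 0.0  # constant data: treat as not skewed
    g1 = m3 / m2 ** 1.5
    # Small-sample bias correction.
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


def is_highly_skewed(values, threshold=20.0):
    """Flag a variable whose skewness magnitude exceeds the (assumed) threshold."""
    return abs(sample_skewness(values)) > threshold
```

A single extreme value among many small ones, as in avg_ticket (maximum 56157.5 against a median near 18), is the kind of pattern that drives the coefficient this high.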

Reproduction

Analysis started: 2021-06-17 15:11:25.270641
Analysis finished: 2021-06-17 15:11:52.708147
Duration: 27.44 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 2969
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2317.292354
Minimum: 0
Maximum: 5715
Zeros: 1
Zeros (%): < 0.1%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 185.4
Q1: 929
Median: 2120
Q3: 3537
95-th percentile: 5035.2
Maximum: 5715
Range: 5715
Interquartile range (IQR): 2608

Descriptive statistics

Standard deviation: 1554.944589
Coefficient of variation (CV): 0.6710178739
Kurtosis: -1.010787014
Mean: 2317.292354
Median Absolute Deviation (MAD): 1271
Skewness: 0.342284058
Sum: 6880041
Variance: 2417852.674
Monotonicity: Strictly increasing
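
Several of the entries reported for df_index are derived from the others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. A small sketch that recomputes them from the reported values (the helper name `derived_stats` is hypothetical and not part of the report):

```python
def derived_stats(minimum, maximum, q1, q3, mean, std):
    """Recompute the derived summary entries from the basic ones."""
    return {
        "range": maximum - minimum,   # spread between the extremes
        "iqr": q3 - q1,               # spread of the middle 50%
        "cv": std / mean,             # dispersion relative to the mean
        "variance": std ** 2,         # squared standard deviation
    }


# Inputs copied from the df_index summary in this report.
stats = derived_stats(minimum=0, maximum=5715, q1=929, q3=3537,
                      mean=2317.292354, std=1554.944589)
print(stats)
```

The same relations can be used to sanity-check the other variables in the report.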
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1 | < 0.1%
2654 | 1 | < 0.1%
2644 | 1 | < 0.1%
597 | 1 | < 0.1%
2646 | 1 | < 0.1%
599 | 1 | < 0.1%
2648 | 1 | < 0.1%
601 | 1 | < 0.1%
603 | 1 | < 0.1%
5144 | 1 | < 0.1%
Other values (2959) | 2959 | 99.7%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%

Value | Count | Frequency (%)
5715 | 1 | < 0.1%
5696 | 1 | < 0.1%
5686 | 1 | < 0.1%
5680 | 1 | < 0.1%
5659 | 1 | < 0.1%

customer_id
Real number (ℝ≥0)

UNIQUE

Distinct: 2969
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15270.77299
Minimum: 12347
Maximum: 18287
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 12347
5-th percentile: 12619.4
Q1: 13799
Median: 15221
Q3: 16768
95-th percentile: 17964.6
Maximum: 18287
Range: 5940
Interquartile range (IQR): 2969

Descriptive statistics

Standard deviation: 1718.990292
Coefficient of variation (CV): 0.1125673398
Kurtosis: -1.206094692
Mean: 15270.77299
Median Absolute Deviation (MAD): 1488
Skewness: 0.03160785866
Sum: 45338925
Variance: 2954927.624
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
16384 | 1 | < 0.1%
18164 | 1 | < 0.1%
12933 | 1 | < 0.1%
12935 | 1 | < 0.1%
14984 | 1 | < 0.1%
17033 | 1 | < 0.1%
13704 | 1 | < 0.1%
12939 | 1 | < 0.1%
17037 | 1 | < 0.1%
14125 | 1 | < 0.1%
Other values (2959) | 2959 | 99.7%

Value | Count | Frequency (%)
12347 | 1 | < 0.1%
12348 | 1 | < 0.1%
12352 | 1 | < 0.1%
12356 | 1 | < 0.1%
12358 | 1 | < 0.1%

Value | Count | Frequency (%)
18287 | 1 | < 0.1%
18283 | 1 | < 0.1%
18282 | 1 | < 0.1%
18277 | 1 | < 0.1%
18276 | 1 | < 0.1%

gross_revenue
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2963
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2749.226056
Minimum: 6.2
Maximum: 279138.02
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 6.2
5-th percentile: 229.77
Q1: 570.96
Median: 1086.92
Q3: 2308.06
95-th percentile: 7219.68
Maximum: 279138.02
Range: 279131.82
Interquartile range (IQR): 1737.1

Descriptive statistics

Standard deviation: 10580.4905
Coefficient of variation (CV): 3.848534202
Kurtosis: 353.9585684
Mean: 2749.226056
Median Absolute Deviation (MAD): 672.72
Skewness: 16.77787915
Sum: 8162452.16
Variance: 111946779.3
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
745.06 | 2 | 0.1%
331 | 2 | 0.1%
734.94 | 2 | 0.1%
379.65 | 2 | 0.1%
533.33 | 2 | 0.1%
731.9 | 2 | 0.1%
889.93 | 1 | < 0.1%
471.51 | 1 | < 0.1%
13375.87 | 1 | < 0.1%
284.46 | 1 | < 0.1%
Other values (2953) | 2953 | 99.5%

Value | Count | Frequency (%)
6.2 | 1 | < 0.1%
13.3 | 1 | < 0.1%
15 | 1 | < 0.1%
36.56 | 1 | < 0.1%
45 | 1 | < 0.1%

Value | Count | Frequency (%)
279138.02 | 1 | < 0.1%
259657.3 | 1 | < 0.1%
194550.79 | 1 | < 0.1%
168472.5 | 1 | < 0.1%
140438.72 | 1 | < 0.1%

recency_days
Real number (ℝ≥0)

ZEROS

Distinct: 272
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 64.28864938
Minimum: 0
Maximum: 373
Zeros: 34
Zeros (%): 1.1%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 11
Median: 31
Q3: 81
95-th percentile: 242
Maximum: 373
Range: 373
Interquartile range (IQR): 70

Descriptive statistics

Standard deviation: 77.75617089
Coefficient of variation (CV): 1.209485215
Kurtosis: 2.778038567
Mean: 64.28864938
Median Absolute Deviation (MAD): 26
Skewness: 1.798396863
Sum: 190873
Variance: 6046.022112
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
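Value | Count | Frequency (%)
1 | 99 | 3.3%
4 | 87 | 2.9%
2 | 85 | 2.9%
3 | 85 | 2.9%
8 | 76 | 2.6%
10 | 67 | 2.3%
7 | 66 | 2.2%
9 | 66 | 2.2%
17 | 64 | 2.2%
22 | 55 | 1.9%
Other values (262) | 2219 | 74.7%

Value | Count | Frequency (%)
0 | 34 | 1.1%
1 | 99 | 3.3%
2 | 85 | 2.9%
3 | 85 | 2.9%
4 | 87 | 2.9%

Value | Count | Frequency (%)
373 | 2 | 0.1%
372 | 4 | 0.1%
371 | 1 | < 0.1%
368 | 1 | < 0.1%
366 | 4 | 0.1%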

qtde_invoices
Real number (ℝ≥0)

Distinct: 56
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.72280229
Minimum: 1
Maximum: 206
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 6
95-th percentile: 17
Maximum: 206
Range: 205
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 8.85665393
Coefficient of variation (CV): 1.547607882
Kurtosis: 190.8253633
Mean: 5.72280229
Median Absolute Deviation (MAD): 2
Skewness: 10.76645634
Sum: 16991
Variance: 78.44031883
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
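Value | Count | Frequency (%)
2 | 786 | 26.5%
3 | 498 | 16.8%
4 | 393 | 13.2%
5 | 237 | 8.0%
1 | 190 | 6.4%
6 | 173 | 5.8%
7 | 138 | 4.6%
8 | 98 | 3.3%
9 | 69 | 2.3%
10 | 55 | 1.9%
Other values (46) | 332 | 11.2%

Value | Count | Frequency (%)
1 | 190 | 6.4%
2 | 786 | 26.5%
3 | 498 | 16.8%
4 | 393 | 13.2%
5 | 237 | 8.0%

Value | Count | Frequency (%)
206 | 1 | < 0.1%
199 | 1 | < 0.1%
124 | 1 | < 0.1%
97 | 1 | < 0.1%
91 | 2 | 0.1%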

qtde_items
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1665
Distinct (%): 56.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1606.461098
Minimum: 1
Maximum: 196844
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 101.4
Q1: 296
Median: 639
Q3: 1399
95-th percentile: 4407.4
Maximum: 196844
Range: 196843
Interquartile range (IQR): 1103

Descriptive statistics

Standard deviation: 5882.976527
Coefficient of variation (CV): 3.6620722
Kurtosis: 467.153716
Mean: 1606.461098
Median Absolute Deviation (MAD): 420
Skewness: 17.87844459
Sum: 4769583
Variance: 34609412.81
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
310 | 11 | 0.4%
88 | 9 | 0.3%
150 | 9 | 0.3%
246 | 8 | 0.3%
84 | 8 | 0.3%
260 | 8 | 0.3%
288 | 8 | 0.3%
272 | 8 | 0.3%
200 | 7 | 0.2%
134 | 7 | 0.2%
Other values (1655) | 2886 | 97.2%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 2 | 0.1%
12 | 2 | 0.1%
16 | 1 | < 0.1%
17 | 1 | < 0.1%

Value | Count | Frequency (%)
196844 | 1 | < 0.1%
80997 | 1 | < 0.1%
79963 | 1 | < 0.1%
77373 | 1 | < 0.1%
69993 | 1 | < 0.1%

qtde_products
Real number (ℝ≥0)

Distinct: 469
Distinct (%): 15.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 122.705288
Minimum: 1
Maximum: 7837
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 9
Q1: 29
Median: 67
Q3: 135
95-th percentile: 382
Maximum: 7837
Range: 7836
Interquartile range (IQR): 106

Descriptive statistics

Standard deviation: 269.8419967
Coefficient of variation (CV): 2.199106503
Kurtosis: 354.8373546
Mean: 122.705288
Median Absolute Deviation (MAD): 44
Skewness: 15.70613971
Sum: 364312
Variance: 72814.70321
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
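Value | Count | Frequency (%)
28 | 45 | 1.5%
20 | 38 | 1.3%
35 | 35 | 1.2%
19 | 33 | 1.1%
15 | 33 | 1.1%
29 | 33 | 1.1%
11 | 32 | 1.1%
26 | 31 | 1.0%
27 | 30 | 1.0%
16 | 29 | 1.0%
Other values (459) | 2630 | 88.6%

Value | Count | Frequency (%)
1 | 6 | 0.2%
2 | 14 | 0.5%
3 | 16 | 0.5%
4 | 17 | 0.6%
5 | 26 | 0.9%

Value | Count | Frequency (%)
7837 | 1 | < 0.1%
5670 | 1 | < 0.1%
5095 | 1 | < 0.1%
4577 | 1 | < 0.1%
2698 | 1 | < 0.1%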

avg_ticket
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
UNIQUE

Distinct: 2969
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51.90005685
Minimum: 2.150588235
Maximum: 56157.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 2.150588235
5-th percentile: 4.916661099
Q1: 13.11933333
Median: 17.97438356
Q3: 24.98828571
95-th percentile: 90.497
Maximum: 56157.5
Range: 56155.34941
Interquartile range (IQR): 11.86895238

Descriptive statistics

Standard deviation: 1036.934336
Coefficient of variation (CV): 19.9794451
Kurtosis: 2890.70744
Mean: 51.90005685
Median Absolute Deviation (MAD): 5.994222271
Skewness: 53.4442279
Sum: 154091.2688
Variance: 1075232.818
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
17.49275862 | 1 | < 0.1%
33.53571429 | 1 | < 0.1%
17.62896175 | 1 | < 0.1%
28.89968794 | 1 | < 0.1%
46.07413043 | 1 | < 0.1%
25.77538462 | 1 | < 0.1%
8.745172414 | 1 | < 0.1%
18.15061538 | 1 | < 0.1%
17.94344444 | 1 | < 0.1%
15.9845 | 1 | < 0.1%
Other values (2959) | 2959 | 99.7%

Value | Count | Frequency (%)
2.150588235 | 1 | < 0.1%
2.4325 | 1 | < 0.1%
2.462371134 | 1 | < 0.1%
2.511241379 | 1 | < 0.1%
2.515333333 | 1 | < 0.1%

Value | Count | Frequency (%)
56157.5 | 1 | < 0.1%
4453.43 | 1 | < 0.1%
3202.92 | 1 | < 0.1%
1687.2 | 1 | < 0.1%
952.9875 | 1 | < 0.1%

avg_recency_days
Real number (ℝ≥0)

Distinct: 1258
Distinct (%): 42.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 67.35143043
Minimum: 1
Maximum: 366
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8
Q1: 25.92857143
Median: 48.28571429
Q3: 85.33333333
95-th percentile: 201
Maximum: 366
Range: 365
Interquartile range (IQR): 59.4047619

Descriptive statistics

Standard deviation: 63.54282948
Coefficient of variation (CV): 0.9434518178
Kurtosis: 4.887703174
Mean: 67.35143043
Median Absolute Deviation (MAD): 26.28571429
Skewness: 2.062908983
Sum: 199966.397
Variance: 4037.691178
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
14 | 25 | 0.8%
4 | 22 | 0.7%
70 | 21 | 0.7%
7 | 20 | 0.7%
35 | 19 | 0.6%
49 | 18 | 0.6%
46 | 17 | 0.6%
11 | 17 | 0.6%
21 | 17 | 0.6%
1 | 16 | 0.5%
Other values (1248) | 2777 | 93.5%

Value | Count | Frequency (%)
1 | 16 | 0.5%
1.5 | 1 | < 0.1%
2 | 13 | 0.4%
2.5 | 1 | < 0.1%
2.601398601 | 1 | < 0.1%

Value | Count | Frequency (%)
366 | 1 | < 0.1%
365 | 1 | < 0.1%
363 | 1 | < 0.1%
362 | 1 | < 0.1%
357 | 2 | 0.1%

frequency
Real number (ℝ≥0)

SKEWED

Distinct: 1225
Distinct (%): 41.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1137912226
Minimum: 0.005449591281
Maximum: 17
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 0.005449591281
5-th percentile: 0.008894164194
Q1: 0.01633986928
Median: 0.02588996764
Q3: 0.04941860465
95-th percentile: 1
Maximum: 17
Range: 16.99455041
Interquartile range (IQR): 0.03307873537

Descriptive statistics

Standard deviation: 0.4081571514
Coefficient of variation (CV): 3.586894861
Kurtosis: 989.3578171
Mean: 0.1137912226
Median Absolute Deviation (MAD): 0.0121913375
Skewness: 24.88037069
Sum: 337.8461398
Variance: 0.1665922603
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 198 | 6.7%
0.02777777778 | 17 | 0.6%
0.0625 | 17 | 0.6%
0.02380952381 | 16 | 0.5%
0.08333333333 | 15 | 0.5%
0.09090909091 | 15 | 0.5%
0.03448275862 | 14 | 0.5%
0.02941176471 | 14 | 0.5%
0.03571428571 | 13 | 0.4%
0.02564102564 | 13 | 0.4%
Other values (1215) | 2637 | 88.8%

Value | Count | Frequency (%)
0.005449591281 | 1 | < 0.1%
0.005464480874 | 1 | < 0.1%
0.005479452055 | 1 | < 0.1%
0.005494505495 | 1 | < 0.1%
0.005586592179 | 2 | 0.1%

Value | Count | Frequency (%)
17 | 1 | < 0.1%
3 | 1 | < 0.1%
2 | 6 | 0.2%
1.142857143 | 1 | < 0.1%
1 | 198 | 6.7%

qtde_returns
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 214
Distinct (%): 7.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 62.1569552
Minimum: 0
Maximum: 80995
Zeros: 1481
Zeros (%): 49.9%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 9
95-th percentile: 100.6
Maximum: 80995
Range: 80995
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 1512.496135
Coefficient of variation (CV): 24.33349783
Kurtosis: 2765.52864
Mean: 62.1569552
Median Absolute Deviation (MAD): 1
Skewness: 51.79774426
Sum: 184544
Variance: 2287644.557
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1481 | 49.9%
1 | 164 | 5.5%
2 | 148 | 5.0%
3 | 105 | 3.5%
4 | 89 | 3.0%
6 | 78 | 2.6%
5 | 61 | 2.1%
12 | 51 | 1.7%
7 | 43 | 1.4%
8 | 43 | 1.4%
Other values (204) | 706 | 23.8%

Value | Count | Frequency (%)
0 | 1481 | 49.9%
1 | 164 | 5.5%
2 | 148 | 5.0%
3 | 105 | 3.5%
4 | 89 | 3.0%

Value | Count | Frequency (%)
80995 | 1 | < 0.1%
9014 | 1 | < 0.1%
8004 | 1 | < 0.1%
4427 | 1 | < 0.1%
3768 | 1 | < 0.1%

avg_basket_size
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 1973
Distinct (%): 66.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 249.349541
Minimum: 1
Maximum: 40498.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 44
Q1: 103.25
Median: 172
Q3: 281.5
95-th percentile: 599.52
Maximum: 40498.5
Range: 40497.5
Interquartile range (IQR): 178.25

Descriptive statistics

Standard deviation: 791.5024106
Coefficient of variation (CV): 3.174268569
Kurtosis: 2256.245507
Mean: 249.349541
Median Absolute Deviation (MAD): 82.75
Skewness: 44.68328098
Sum: 740318.7873
Variance: 626476.066
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
100 | 11 | 0.4%
114 | 10 | 0.3%
82 | 9 | 0.3%
86 | 9 | 0.3%
73 | 9 | 0.3%
136 | 8 | 0.3%
88 | 8 | 0.3%
60 | 8 | 0.3%
75 | 8 | 0.3%
163 | 7 | 0.2%
Other values (1963) | 2882 | 97.1%

Value | Count | Frequency (%)
1 | 2 | 0.1%
2 | 1 | < 0.1%
3.333333333 | 1 | < 0.1%
5.333333333 | 1 | < 0.1%
5.666666667 | 1 | < 0.1%

Value | Count | Frequency (%)
40498.5 | 1 | < 0.1%
6009.333333 | 1 | < 0.1%
4282 | 1 | < 0.1%
3906 | 1 | < 0.1%
3868.65 | 1 | < 0.1%

avg_unique_basket_size
Real number (ℝ≥0)

Distinct: 1010
Distinct (%): 34.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.15507374
Minimum: 1
Maximum: 299.7058824
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3.345454545
Q1: 10
Median: 17.2
Q3: 27.75
95-th percentile: 56.94
Maximum: 299.7058824
Range: 298.7058824
Interquartile range (IQR): 17.75

Descriptive statistics

Standard deviation: 19.51303316
Coefficient of variation (CV): 0.8807478316
Kurtosis: 27.69469772
Mean: 22.15507374
Median Absolute Deviation (MAD): 8.2
Skewness: 3.498252107
Sum: 65778.41393
Variance: 380.7584629
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
13 | 53 | 1.8%
14 | 40 | 1.3%
11 | 38 | 1.3%
9 | 33 | 1.1%
20 | 33 | 1.1%
1 | 32 | 1.1%
18 | 31 | 1.0%
10 | 30 | 1.0%
16 | 29 | 1.0%
17 | 28 | 0.9%
Other values (1000) | 2622 | 88.3%

Value | Count | Frequency (%)
1 | 32 | 1.1%
1.2 | 1 | < 0.1%
1.25 | 1 | < 0.1%
1.333333333 | 2 | 0.1%
1.5 | 8 | 0.3%

Value | Count | Frequency (%)
299.7058824 | 1 | < 0.1%
259 | 1 | < 0.1%
203.5 | 1 | < 0.1%
148 | 1 | < 0.1%
145 | 1 | < 0.1%

Interactions

[Figures: pairwise interaction plots for the 13 variables (13 × 13 = 169 panels)]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
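
As an illustration of that definition, here is a minimal pure-Python sketch. The function name `pearson_r` is hypothetical, and this is not the implementation used to produce the report. The 1/(n-1) factors of the covariance and the standard deviations cancel, so plain sums of deviations suffice.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of (co-)deviations; the common normalising factor cancels in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)
```

Rescaling or shifting either input, for example `pearson_r([10 * x + 3 for x in xs], ys)`, leaves the result unchanged up to floating-point error, which is the invariance mentioned above.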

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
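
A minimal sketch of that procedure in pure Python, assuming tied values share the average of their rank positions (a common convention, but an assumption here). The names `average_ranks` and `spearman_rho` are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)
```

For a strictly increasing but nonlinear relationship such as y = x³, the ranks of x and y coincide, so ρ reaches its maximum even though the relationship is not linear.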

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
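
The formula as stated corresponds to the tau-a variant. A minimal sketch follows; the function name `kendall_tau_a` is hypothetical. Library implementations commonly default to tau-b, which additionally corrects for ties, so values computed this way may differ from the report's when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # both variables move in the same direction
        elif product < 0:
            discordant += 1   # the variables move in opposite directions
    total_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / total_pairs
```

This pairwise loop is quadratic in the number of observations, which is acceptable for a sketch but slow for large datasets.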

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[Figures: two missing-value plots]

Sample

First rows

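(index) | df_index | customer_id | gross_revenue | recency_days | qtde_invoices | qtde_items | qtde_products | avg_ticket | avg_recency_days | frequency | qtde_returns | avg_basket_size | avg_unique_basket_size
0 | 0 | 17850 | 5391.21 | 372.00 | 34.00 | 1733.00 | 297.00 | 18.15 | 35.50 | 17.00 | 40.00 | 50.97 | 8.74
1 | 1 | 13047 | 3232.59 | 56.00 | 9.00 | 1390.00 | 171.00 | 18.90 | 27.25 | 0.03 | 35.00 | 154.44 | 19.00
2 | 2 | 12583 | 6705.38 | 2.00 | 15.00 | 5028.00 | 232.00 | 28.90 | 23.19 | 0.04 | 50.00 | 335.20 | 15.47
3 | 3 | 13748 | 948.25 | 95.00 | 5.00 | 439.00 | 28.00 | 33.87 | 92.67 | 0.02 | 0.00 | 87.80 | 5.60
4 | 4 | 15100 | 876.00 | 333.00 | 3.00 | 80.00 | 3.00 | 292.00 | 8.60 | 0.07 | 22.00 | 26.67 | 1.00
5 | 5 | 15291 | 4623.30 | 25.00 | 14.00 | 2102.00 | 102.00 | 45.33 | 23.20 | 0.04 | 29.00 | 150.14 | 7.29
6 | 6 | 14688 | 5630.87 | 7.00 | 21.00 | 3621.00 | 327.00 | 17.22 | 18.30 | 0.06 | 399.00 | 172.43 | 15.57
7 | 7 | 17809 | 5411.91 | 16.00 | 12.00 | 2057.00 | 61.00 | 88.72 | 35.70 | 0.03 | 41.00 | 171.42 | 5.08
8 | 8 | 15311 | 60767.90 | 0.00 | 91.00 | 38194.00 | 2379.00 | 25.54 | 4.14 | 0.24 | 474.00 | 419.71 | 26.14
9 | 9 | 16098 | 2005.63 | 87.00 | 7.00 | 613.00 | 67.00 | 29.93 | 47.67 | 0.02 | 0.00 | 87.57 | 9.57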

Last rows

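(index) | df_index | customer_id | gross_revenue | recency_days | qtde_invoices | qtde_items | qtde_products | avg_ticket | avg_recency_days | frequency | qtde_returns | avg_basket_size | avg_unique_basket_size
2959 | 5627 | 17727 | 1060.25 | 15.00 | 1.00 | 645.00 | 66.00 | 16.06 | 6.00 | 1.00 | 6.00 | 645.00 | 66.00
2960 | 5637 | 17232 | 421.52 | 2.00 | 2.00 | 203.00 | 36.00 | 11.71 | 12.00 | 0.15 | 0.00 | 101.50 | 18.00
2961 | 5638 | 17468 | 137.00 | 10.00 | 2.00 | 116.00 | 5.00 | 27.40 | 4.00 | 0.40 | 0.00 | 58.00 | 2.50
2962 | 5649 | 13596 | 697.04 | 5.00 | 2.00 | 406.00 | 166.00 | 4.20 | 7.00 | 0.25 | 0.00 | 203.00 | 83.00
2963 | 5655 | 14893 | 1237.85 | 9.00 | 2.00 | 799.00 | 73.00 | 16.96 | 2.00 | 0.67 | 0.00 | 399.50 | 36.50
2964 | 5659 | 12479 | 473.20 | 11.00 | 1.00 | 382.00 | 30.00 | 15.77 | 4.00 | 1.00 | 34.00 | 382.00 | 30.00
2965 | 5680 | 14126 | 706.13 | 7.00 | 3.00 | 508.00 | 15.00 | 47.08 | 3.00 | 0.75 | 50.00 | 169.33 | 5.00
2966 | 5686 | 13521 | 1092.39 | 1.00 | 3.00 | 733.00 | 435.00 | 2.51 | 4.50 | 0.30 | 0.00 | 244.33 | 145.00
2967 | 5696 | 15060 | 301.84 | 8.00 | 4.00 | 262.00 | 120.00 | 2.52 | 1.00 | 2.00 | 0.00 | 65.50 | 30.00
2968 | 5715 | 12558 | 269.96 | 7.00 | 1.00 | 196.00 | 11.00 | 24.54 | 6.00 | 1.00 | 196.00 | 196.00 | 11.00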